Reinforcement Learning With Human Feedback - How To Train And Fine-Tune Transformer Models